label correction
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
Label Correction of Crowdsourced Noisy Annotations with an Instance-Dependent Noise Transition Model
The predictive ability of supervised learning algorithms hinges on the quality of annotated examples, whose labels often come from multiple crowdsourced annotators with diverse expertise. To aggregate noisy crowdsourced annotations, many existing methods employ an annotator-specific instance-independent noise transition matrix to characterize the labeling skills of each annotator. Learning an instance-dependent noise transition model, however, is challenging and remains relatively less explored. To address this problem, in this paper, we formulate the noise transition model in a Bayesian framework and subsequently design a new label correction algorithm.
A$^2$LC: Active and Automated Label Correction for Semantic Segmentation
Jeon, Youjin, Cho, Kyusik, Woo, Suhan, Kim, Euntai
Active Label Correction (ALC) has emerged as a promising solution to the high cost and error-prone nature of manual pixel-wise annotation in semantic segmentation, by actively identifying and correcting mislabeled data. Although recent work has improved correction efficiency by generating pseudo-labels using foundation models, substantial inefficiencies still remain. In this paper, we introduce A$^2$LC, an Active and Automated Label Correction framework for semantic segmentation, where manual and automatic correction stages operate in a cascaded manner. Specifically, the automatic correction stage leverages human feedback to extend label corrections beyond the queried samples, thereby maximizing cost efficiency. In addition, we introduce an adaptively balanced acquisition function that emphasizes underrepresented tail classes, working in strong synergy with the automatic correction stage. Extensive experiments on Cityscapes and PASCAL VOC 2012 demonstrate that A$^2$LC significantly outperforms previous state-of-the-art methods. Notably, A$^2$LC exhibits high efficiency by outperforming previous methods with only 20% of their budget, and shows strong effectiveness by achieving a 27.23% performance gain under the same budget on Cityscapes.
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.46)
Learning to Clean: Reinforcement Learning for Noisy Label Correction
Heidari, Marzi, Zhang, Hanping, Guo, Yuhong
The challenge of learning with noisy labels is significant in machine learning, as it can severely degrade the performance of prediction models if not addressed properly. This paper introduces a novel framework that conceptualizes noisy label correction as a reinforcement learning (RL) problem. The proposed approach, Reinforcement Learning for Noisy Label Correction (RLNLC), defines a comprehensive state space representing data and their associated labels, an action space that indicates possible label corrections, and a reward mechanism that evaluates the efficacy of label corrections. RLNLC learns a deep feature representation based policy network to perform label correction through reinforcement learning, utilizing an actor-critic method. The learned policy is subsequently deployed to iteratively correct noisy training labels and facilitate the training of the prediction model. The effectiveness of RLNLC is demonstrated through extensive experiments on multiple benchmark datasets, where it consistently outperforms existing state-of-the-art techniques for learning with noisy labels.
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
Set a Thief to Catch a Thief: Combating Label Noise through Noisy Meta Learning
Wang, Hanxuan, Lu, Na, Zhao, Xueying, Yan, Yuxuan, Ma, Kaipeng, Keong, Kwoh Chee, Carneiro, Gustavo
X, SEPTEMBER XXXX 1 Set a Thief to Catch a Thief: Combating Label Noise through Noisy Meta Learning Hanxuan Wang, Na Lu, Xueying Zhao, Y uxuan Y an, Kaipeng Ma, Kwoh Chee Keong, Gustavo Carneiro Abstract --Learning from noisy labels (LNL) aims to train high-performance deep models using noisy datasets. Meta learning based label correction methods have demonstrated remarkable performance in LNL by designing various meta label rectification tasks. However, extra clean validation set is a prerequisite for these methods to perform label correction, requiring extra labor and greatly limiting their practicality. T o tackle this issue, we propose a novel noisy meta label correction framework STCT, which counterintuitively uses noisy data to correct label noise, borrowing the spirit in the saying " S et a T hief to C atch a T hief". The core idea of STCT is to leverage noisy data which is i.i.d. with the training data as a validation set to evaluate model performance and perform label correction in a meta learning framework, eliminating the need for extra clean data. By decoupling the complex bi-level optimization in meta learning into representation learning and label correction, STCT is solved through an alternating training strategy between noisy meta correction and semi-supervised representation learning. Extensive experiments on synthetic and real-world datasets demonstrate the outstanding performance of STCT, particularly in high noise rate scenarios. STCT achieves 96.9% label correction and 95.2% classification performance on CIF AR-10 with 80% symmetric noise, significantly surpassing the current state-of-the-art. I NTRODUCTION D EEP Deep learning has achieved great success in various fields, attributed to the availability of carefully annotated large scale datasets [1], [2], [3]. However, the collection of high quality datasets generally comes with high annotation cost and intensive human intervention, creating a significant obstacle to the development of deep learning. Fortunately, the annotation cost issue can be mitigated through web crawling [4] and crowdsourcing [5]. However, such low-cost datasets often contain a considerable amount of noisy labels, which may lead to severe overfitting of neural networks and performance degradation [6].
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Europe > United Kingdom > England > Surrey (0.04)
- Asia > Singapore (0.04)
Suppressing Uncertainty in Gaze Estimation
Uncertainty in gaze estimation manifests in two aspects: 1) low-quality images caused by occlusion, blurriness, inconsistent eye movements, or even non-face images; 2) incorrect labels resulting from the misalignment between the labeled and actual gaze points during the annotation process. Allowing these uncertainties to participate in training hinders the improvement of gaze estimation. To tackle these challenges, in this paper, we propose an effective solution, named Suppressing Uncertainty in Gaze Estimation (SUGE), which introduces a novel triplet-label consistency measurement to estimate and reduce the uncertainties. Specifically, for each training sample, we propose to estimate a novel ``neighboring label'' calculated by a linearly weighted projection from the neighbors to capture the similarity relationship between image features and their corresponding labels, which can be incorporated with the predicted pseudo label and ground-truth label for uncertainty estimation. By modeling such triplet-label consistency, we can measure the qualities of both images and labels, and further largely reduce the negative effects of unqualified images and wrong labels through our designed sample weighting and label correction strategies. Experimental results on the gaze estimation benchmarks indicate that our proposed SUGE achieves state-of-the-art performance.
Data Pruning Can Do More: A Comprehensive Data Pruning Approach for Object Re-identification
Yang, Zi, Yang, Haojin, Majumder, Soumajit, Cardoso, Jorge, Gallego, Guillermo
Previous studies have demonstrated that not each sample in a dataset is of equal importance during training. Data pruning aims to remove less important or informative samples while still achieving comparable results as training on the original (untruncated) dataset, thereby reducing storage and training costs. However, the majority of data pruning methods are applied to image classification tasks. To our knowledge, this work is the first to explore the feasibility of these pruning methods applied to object re-identification (ReID) tasks, while also presenting a more comprehensive data pruning approach. By fully leveraging the logit history during training, our approach offers a more accurate and comprehensive metric for quantifying sample importance, as well as correcting mislabeled samples and recognizing outliers. Furthermore, our approach is highly efficient, reducing the cost of importance score estimation by 10 times compared to existing methods. Our approach is a plug-and-play, architecture-agnostic framework that can eliminate/reduce 35%, 30%, and 5% of samples/training time on the VeRi, MSMT17 and Market1501 datasets, respectively, with negligible loss in accuracy (< 0.1%). The lists of important, mislabeled, and outlier samples from these ReID datasets are available at https://github.com/Zi-Y/data-pruning-reid.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Label Correction of Crowdsourced Noisy Annotations with an Instance-Dependent Noise Transition Model
The predictive ability of supervised learning algorithms hinges on the quality of annotated examples, whose labels often come from multiple crowdsourced annotators with diverse expertise. To aggregate noisy crowdsourced annotations, many existing methods employ an annotator-specific instance-independent noise transition matrix to characterize the labeling skills of each annotator. Learning an instance-dependent noise transition model, however, is challenging and remains relatively less explored. To address this problem, in this paper, we formulate the noise transition model in a Bayesian framework and subsequently design a new label correction algorithm. To theoretically characterize the distance between the noise transition model and the true instance-dependent noise transition matrix, we provide a posterior-concentration theorem that ensures the posterior consistency in terms of the Hellinger distance.
- Information Technology > Communications > Social Media > Crowdsourcing (0.89)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.42)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.42)
Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction
Huang, Po-Hsuan, Lin, Chia-Ching, Hsu, Chih-Fan, Chang, Ming-Ching, Chen, Wei-Chao
Learning from noisy-labeled data is crucial for real-world applications. Traditional Noisy-Label Learning (NLL) methods categorize training data into clean and noisy sets based on the loss distribution of training samples. However, they often neglect that clean samples, especially those with intricate visual patterns, may also yield substantial losses. This oversight is particularly significant in datasets with Instance-Dependent Noise (IDN), where mislabeling probabilities correlate with visual appearance. Our approach explicitly distinguishes between clean vs.noisy and easy vs. hard samples. We identify training samples with small losses, assuming they have simple patterns and correct labels. Utilizing these easy samples, we hallucinate multiple anchors to select hard samples for label correction. Corrected hard samples, along with the easy samples, are used as labeled data in subsequent semi-supervised training. Experiments on synthetic and real-world IDN datasets demonstrate the superior performance of our method over other state-of-the-art NLL methods.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)